Enhanced word classing for Model M
Authors
Abstract
Model M is a superior class-based n-gram model that has shown improvements on a variety of tasks and domains. In previous work with Model M, bigram mutual information clustering has been used to derive word classes. In this paper, we introduce a new word classing method designed to closely match Model M. The proposed classing technique achieves gains in speech recognition word-error rate of up to 1.1% absolute over the baseline clustering, and a total gain of up to 3.0% absolute over a Katz-smoothed trigram model, the largest such gain ever reported for a class-based language model.
Similar resources
Enhanced Word Classing for Recurrent Neural Network Language
Recurrent Neural Network Language Model (RNNLM) has recently been shown to outperform conventional N-gram LM as well as many other competing advanced language model techniques. However, the computation complexity of RNNLM is much higher than the conventional N-gram LM. As a result, the Class-based RNNLM (CRNNLM) is usually employed to speed up both the training and testing phase of RNNLM. In pr...
A Study of Word-Classing for MT Reordering
MT systems typically use parsers to help reorder constituents. However most languages do not have adequate treebank data to learn good parsers, and such training data is extremely time-consuming to annotate. Our earlier work has shown that a reordering model learnt from word-alignments using POS tags as features can improve MT performance (Visweswariah et al., 2011). In this paper, we investiga...
Confusion Network for Arabic Name Disambiguation and Transliteration in Statistical Machine Translation
Arabic words are often ambiguous between name and non-name interpretations, frequently leading to incorrect name translations. We present a technique to disambiguate and transliterate names even if name interpretations do not exist or have relatively low probability distributions in the parallel training corpus. The key idea comprises named entity classing at the preprocessing step, decoding of...
Modeling of Nanofiltration for Concentrated Electrolyte Solutions using Linearized Transport Pore Model
In this study, linearized transport pore model (LTPM) is applied for modeling nanofiltration (NF) membrane separation process. This modeling approach is based on the modified extended Nernst-Planck equation enhanced by Debye-Huckel theory to take into account the variations of activity coefficient especially at high salt concentrations. Rejection of single-salt (NaCl) electrolyte is inve...
Multispectral Image Classification Using Back-propagation Neural Network in PCA Domain
Recently, in classification of multispectral remote sensing images using a back-propagation neural network (BPNN), all bands of the image must be used for training and classing. The disadvantage of this method is that it requires not only more time for training and classing but also more complexity. In this paper, to reduce this disadvantage, principal component analysis (PCA) is applied to ...
Journal title:
Volume, issue:
Pages: -
Publication date: 2010